The first paper to describe chi-square was published by Karl Pearson in 1900. As noted by Cochran (1952), the chi-square paper was, and still is, one of the most important publications in the history of modern statistics.

In 1949 Lewis and Burke authored an article appearing in the Psychological Bulletin entitled "The Use and Misuse of the Chi-Square Test." Their stated aim was to counteract the improper use of this statistic by psychologists. The paper addressed nine major sources of error, cited examples from the literature to illustrate these points, and caused a stir among practicing researchers. Subsequently, the Lewis and Burke paper was followed by several responses (Edwards, 1950; Pastore, 1950; Peters, 1950) and a rejoinder by Lewis and Burke (1950).

Since then a great deal of research has been conducted on the chi-square procedure, and several methods have been developed to handle some of the problems cited by Lewis and Burke. This paper is a review of that literature. It is an attempt to address the problems listed by Lewis and Burke in light of current knowledge and to form recommendations regarding the use and misuse of the chi-square test.

Background 

In a compact writing style, Karl Pearson used a geometric proof to derive the distribution theory for establishing the necessary significance level for testing the chi-square statistic. He concerned himself specifically with the problem of determining goodness-of-fit and gave eight numerical illustrations of the use of this new criterion. It is interesting to note that he did not show that the limiting distribution of the test statistic is $\chi^2$. This fact was proven subsequently (Cramér, 1946). Pearson also provided incorrect degrees of freedom for testing the statistic.



Originally, the value calculated for a test statistic was compared against tabled values such as those in Elderton's Tables of Goodness-of-Fit (Pearson, 1914). The table was entered using n' = r(c), where r = number of rows and c = number of columns in the contingency table. Yet in 1915 Greenwood and Yule published an article on research into the effect of inoculation against typhoid and cholera in which they noted that a comparison of proportions should yield the same result as a chi-square test, but it did not. Unable to explain this discrepancy, they stated a preference for the more conservative chi-square procedure. This same inconsistency was noted by Bowley (1920). The determination of the correct degrees of freedom as (r-1)(c-1) was shown by Fisher in two theoretical papers (1922, 1924) and confirmed by Yule (1922) and Brownlee (1924), using sampling experiments.

As the use of the chi-square procedure began to grow, its applications and limitations were explored. In the first of three associated papers, Fry (1938) presented and explained the derivation of the chi-square statistic. Subsequent to Fry, Berkson's (1938) paper pointed to the fact that as the sample size increases, the test statistic will eventually reach a significant level. Berkson also noted that this is basically an omnibus test of the hypothesis of equal proportions. That is, one could not locate the specific source within a design that produced a significant result. These two papers were in turn followed by a discussion by Camp (1938) regarding further interpretation of chi-square.
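Berkson's point about sample size can be illustrated numerically. The following is a minimal sketch, not taken from the source: the counts are hypothetical, and `pearson_chi_square` and `statistic_at_scale` are names assumed here for illustration. The observed proportions are held at 52% and 48% against a hypothesized equal split, and only the sample size is allowed to grow.

```python
def pearson_chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all classes."""
    return sum((x - e) ** 2 / e for x, e in zip(observed, expected))


def statistic_at_scale(k):
    """Statistic for a hypothetical 52/48 split observed in a sample of 100 * k."""
    observed = [52 * k, 48 * k]   # hypothetical counts, proportions held fixed
    expected = [50 * k, 50 * k]   # hypothesized equal proportions
    return pearson_chi_square(observed, expected)


# The same proportions at N = 100, 1,000, and 10,000.
values = {100 * k: statistic_at_scale(k) for k in (1, 10, 100)}
for n, x2 in values.items():
    print(n, round(x2, 2))
```

With one degree of freedom the tabled .05 critical value is 3.841, so only the largest sample is declared significant even though the proportions never change.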



Many significant contributions to both the theory and applications of this test statistic followed within the next 30 to 40 years. Certainly the most important contributors include Karl Pearson himself, R. A. Fisher, J. Neyman, and E. Pearson. A brief historical development can be found in H. O. Lancaster's book along with an excellent bibliography (1969).

The Chi-Square Statistic

Following the lead of Lewis and Burke, this paper is written with the social science researcher in mind. Consequently, the mathematical derivations are more appropriately handled elsewhere (Cramér, 1946; Lancaster, 1969). The following basics of the derivation are presented following Fry (1938). To avoid confusion, the symbol $X^2$ will be used to distinguish the calculated test statistic from the tabled distribution represented by the Greek symbol $\chi^2$, against which the $X^2$ value is compared in hypothesis testing.



Given a population of $N$ independent events with $s$ possible outcomes, the joint probability $P$ of obtaining $n_1$ events in category 1, $n_2$ events in category 2, and so on up to $n_s$ events in category $s$ is given by the multinomial distribution function

$$P = \frac{N!}{n_1!\, n_2! \cdots n_s!}\; p_1^{n_1} p_2^{n_2} \cdots p_s^{n_s} \tag{1}$$

where $s$ is the number of categories or possible outcomes and $p_i$ is the probability of category $i$. From this expression the distribution function of chi-square is derived.
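The multinomial probability in equation (1) can be evaluated directly for small samples. The sketch below is illustrative only: the function name `multinomial_probability` and the die-throwing example are assumptions introduced here, not part of the original derivation.

```python
from math import factorial, prod


def multinomial_probability(counts, probs):
    """Exact probability of observing `counts` given category probabilities `probs`."""
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    n_total = sum(counts)
    # Multinomial coefficient N! / (n_1! n_2! ... n_s!), kept as an exact integer.
    coefficient = factorial(n_total)
    for n_i in counts:
        coefficient //= factorial(n_i)
    # Product of p_i ** n_i over the s categories.
    return coefficient * prod(p ** n for p, n in zip(probs, counts))


# Hypothetical example: six throws of a fair die, each face appearing exactly once.
p_example = multinomial_probability([1] * 6, [1 / 6] * 6)
print(p_example)
```

The factorials grow very quickly with the sample size, which is why the exact formula is rarely computed by hand and an approximation is used instead.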
This exact formula (1) is very difficult to compute. However, a reasonable approximation may be substituted. This is accomplished by three approximations to the formula. The first involves replacing the factorials by their Stirling approximations. This produces

[Equation (2) is illegible in the source.]

The second approximation is equivalent to replacing $(1 + x/m)^m$ by $e^x$ for large $m$. In this case the result is

[Equation (3) is illegible in the source.]

The third approximation consists of replacing the sum of the discrete probabilities of each $n_i$ by an integral. [The remainder of this sentence is illegible in the source.]

The computational result is

$$X^2 = \sum_{i=1}^{s} \frac{(x_i - e_i)^2}{e_i} \tag{4}$$

where $x_i$ is the observed frequency in class $i$ and $e_i$ is the expected frequency in class $i$, which equals $Np_i$, the total number of observations multiplied by the probability of class $i$.
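The computing formula, the sum over classes of the squared difference between observed and expected frequency divided by the expected frequency, is simple to apply. The following is a minimal sketch with hypothetical die-throwing counts; the function name `pearson_chi_square` is an assumption made for illustration.

```python
def pearson_chi_square(observed, expected):
    """Return the sum of (x_i - e_i)^2 / e_i over all classes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected frequencies must be positive")
    return sum((x - e) ** 2 / e for x, e in zip(observed, expected))


# Hypothetical data: 120 throws of a die, tested against equal probabilities.
observed = [18, 22, 20, 25, 15, 20]
n_total = sum(observed)
expected = [n_total / 6] * 6          # e_i = N * p_i with p_i = 1/6

x2 = pearson_chi_square(observed, expected)
print(round(x2, 2))
```

The resulting value would then be compared against the tabled chi-square distribution with five degrees of freedom.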

Two items which enter into the discussion concerning the proper use of chi-square should be noted at this point. First, to employ the underlying multinomial distribution, the assumption that the $x_i$ are distributed normally is necessary. This means that the expected values ($e_i$) must be sufficiently large for the approximation to be adequate. Second, equation (1) also requires that each of the probabilities in that expression be independent. This implies that the terms which are summed in equation (4) must be independent of each other.

The Use and Misuse of Chi-Square

Lewis and Burke centered their 1949 article around nine principal sources of error they found in their review of published research. Those nine sources are:

1) Lack of independence among single events or measures
2) Small theoretical frequencies
3) Neglect of frequencies of non-occurrence
4) Failure to equalize the sum of the observed frequencies and the sum of the theoretical frequencies
5) Indeterminate theoretical frequencies
6) Incorrect or questionable categorizing
7) Use of non-frequency data
8) Incorrect determination of the number of degrees of freedom
9) Incorrect computations

This paper will address each of these issues and then consider some aspects of the chi-square procedure that Lewis and Burke did not list as sources of error.

Lack of Independence Among Single Events or Measures

In order for the limiting distribution of $X^2$ to be $\chi^2$, it is necessary that those events or measures from which $X^2$ is calculated be independent. This is so because it is the joint probability of $N$ independent events that is given by the multinomial distribution function.

In designs involving single subject research, or repeated measures on several subjects, this lack of independence is obvious. But often a lack of independence is not noticed, particularly when the final $X^2$ value is the result of the addition of several other $X^2$ values. A subtle yet telling example is cited by Lewis and Burke early in their paper.

In a hypothetical experiment twelve dice were thrown a number of times and the number of "ones" appearing on each throw was recorded. The test statistic was calculated by summing the quantity $(x - e)^2 / e$ for each of the throws. The problem with this procedure is that the same twelve dice were thrown each time. There is no independence between the terms which are summed. Therefore, statements pertaining to any population other than the 12 dice themselves cannot be meaningfully made. If one wished to generalize the results beyond these twelve dice, then a new sample must be drawn. In his response to Lewis and Burke, Peters (1950) makes this point and remarks that a lack of generalizability to a population is probably not too useful to most researchers. But Peters holds firm in stating that if one is concerned with these dice, or subjects, then repeated measures are appropriate.

Small Theoretical Frequencies

One of the most controversial aspects regarding the use of the chi-square procedure is the establishment of a minimum expected value, that is, a value below which the smallest expected frequency may not drop for the application of the test to be appropriate. This is required by the use of the three approximations in the derivation. In order for a calculated $X^2$ to approximate $\chi^2$, it is necessary for the sample to be of sufficient size to make those approximations reasonable. This is reflected by the expected value in each cell.



Lewis and Burke called the use of expected frequencies which are too small the most common weakness in the use of chi-square (p. 460). In their paper they took the position that expected values of five were probably too low. They stated a preference for a minimum expected value of 10, with five as the absolute lowest limit. Lewis and Burke subsequently cited two published studies, each employing a chi-square test with expected values below 10, as examples. It appears today that their position, a popular one among researchers, may be overly conservative.



This problem has been examined from two different perspectives. One may consider this issue in relation to the use of chi-square for testing goodness-of-fit. In this approach, as the categories are chosen arbitrarily, the researcher has control over the size of the expected value by choice of the category size. In contrast, the categories of contingency tables are relatively limited and one is forced to increase the expected values by increasing the sample size and/or collapsing rows and/or columns. However, it is often difficult, if not impossible, to collect more data to increase N. Collapsing columns and/or rows is in effect throwing away information. Additionally, the information is in an area, the extremes, where differences are most likely to occur. Research taken from the perspective of this latter case will be considered first.



Recommendations vary a great deal. Kendall (1952) preferred expected frequencies greater than 20. Cramér (1946) has recommended values be greater than 10. Fisher (1938) preferred a lowest value of five. Jeffreys (1961), Slakter (1965), and Kempthorne (1966) set one as the minimal expected frequency allowable. Wise (1963) has taken the stance that the expected cell frequencies could be quite small if they are nearly equal to each other. In fact, Wise recommended small but equal expected frequencies over the case where a few expected values are small and the remaining frequencies are well above most criteria.



In a 1952 article Cochran suggested that instead of a single value, the application of chi-square may be deemed appropriate if no more than 20% of the cells have expected values between one and five. Good, Gover, and Mitchell (1970) concluded that in the case where each of the s categories has a probability of 1/s, an equiprobable distribution, the approximation of the test statistic to the chi-square distribution is adequate even when the expected values are as low as 1/3 (p. 275). This apparent robust nature of the procedure is also supported by Lewontin and Felsenstein (1965). They used Monte Carlo methods to examine 2 x N tables with fixed marginals. When the expected values are small in each cell, the authors concluded that the test tends to be conservative when the degrees of freedom equal or exceed five. Lewontin and Felsenstein found that even the occurrences of expected values below one generally do not invalidate the procedure.

The examination of this problem from the more flexible perspective of the use of chi-square in the case of testing goodness-of-fit has produced some interesting results. Kendall and Stuart (1952), following suggestions by Mann and Wald (1942) and Gumbel (1943), recommended that one choose categories so that each had a probability equal to the reciprocal of the number of categories, which gives equal expected frequencies. They prefer a minimum value of five. In 1968, Slakter presented the results of a Monte Carlo study concerning the accuracy of an approximation of power for the chi-square goodness-of-fit test with small but equal expected frequencies. He used various combinations of sample size, number of categories, and Type I error probability levels. The results confirmed his earlier work (1965, 1966) and the work of Good (1961) and Wise (1963), which indicated that the nominal alpha level does not deviate substantially when the expected values are small but equal.
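The kind of Monte Carlo check described above can be sketched in a few lines. This is not a reconstruction of any of the cited studies: the sample size, number of categories, number of trials, seed, and function names are all assumptions chosen for illustration. The sketch draws samples from an equiprobable multinomial, so the null hypothesis is true, and counts how often the statistic exceeds the tabled critical value.

```python
import random


def pearson_chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all classes."""
    return sum((x - e) ** 2 / e for x, e in zip(observed, expected))


def simulated_rejection_rate(n_obs, n_categories, critical_value,
                             n_trials=5000, seed=1):
    """Share of simulated samples whose statistic exceeds the tabled critical
    value when the equiprobable null hypothesis is true."""
    rng = random.Random(seed)
    expected = [n_obs / n_categories] * n_categories
    rejections = 0
    for _ in range(n_trials):
        counts = [0] * n_categories
        for _ in range(n_obs):
            counts[rng.randrange(n_categories)] += 1
        if pearson_chi_square(counts, expected) > critical_value:
            rejections += 1
    return rejections / n_trials


# Five equiprobable categories and ten observations: every expected value is 2.
# 9.488 is the tabled .05 critical value of chi-square with 4 degrees of freedom.
rate = simulated_rejection_rate(n_obs=10, n_categories=5, critical_value=9.488)
print(f"empirical Type I error rate: {rate:.3f}")
```

An empirical rate close to or below the nominal .05 would be in line with the finding that small but equal expected values do not inflate the Type I error.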

In an article based on his dissertation, Yarnold (1970) numerically examined the accuracy of approximation of the chi-square goodness-of-fit test. He proposed that, "If the number of classes, s, is three or more, and if r denotes the number of expectations less than five, then the minimum expectation may be as small as 5r/s" (p. 865). The remainder of his paper deals with a new approximation technique used to study the proposed rule. In conclusion, he stated that, "One of the main conclusions of this article is that the upper one and five percentage points of the $X^2$ approximation can be used with much smaller expectations than previously considered possible" (p. 882).

After considering earlier work, Roscoe and Byars (1971) recommended that for the examination of goodness-of-fit with more than one degree of freedom, one should be concerned with the average expected value. In the uniform case, that is, equal expected cell frequencies, they suggest an average value of two or more for an alpha equal to .05 and four or more for an alpha equal to .01. They urge the use of this average expected value rule in the test for independence as well, even when the sample sizes are not equal.

The advantages of several goodness-of-fit tests for discrete data were reviewed by Horn (1977). In so doing she pointed out that Roscoe and Byars' rule is in agreement with Slakter's (1965, 1966) suggestion that what may be most important is the average of the expected frequencies. She also noted that this subsumes Cochran's rule that 20% of the expected frequencies should be greater than one.

There is a further point which should be mentioned. As Horn points out, the chi-square goodness-of-fit test is an approximation in two ways. It approximates the exact multinomial goodness-of-fit test, and its distribution is an approximation to the theoretical chi-square distribution. The studies cited above are concerned with the second form of approximation. Tate and Hyer (1973) have explored the accuracy of approximation in the first form. They stated that "To justify the $X^2$ test because the $X^2$ distribution tails off similarly to the [theoretical] chi-square distribution is to assume that chi-square is itself an accurate approximation to the multinomial" (p. 837).

Tate and Hyer (1969) generated 162 various multinomial distributions and compared them to chi-square values. Their 1973 paper examined their data more closely. They concluded that the chi-square procedure produces false results for a given alpha when expected values drop below 10. They noted that the degree of accuracy required will vary from situation to situation. When close approximation to the exact multinomial is needed, chi-square should only be used when the expected frequencies are above 20 or so.



Most recently, Overall (1980) has examined the effect of low expected frequencies in one row or column of 2 x 2 designs on the power of the chi-square statistic. This most often results from the analysis of infrequently occurring events. Setting $1 - \beta = .70$ as a minimally acceptable level, Overall concluded that when expected values are quite low, the power of the chi-square test drops to a level that produces a statistic which, in his view, is almost useless. Further considerations regarding the power of chi-square may be found in Cramér (1946), Bennett and Hsu (1960), Harkness and Katz (1964), Chapman and Meng (1966), and Broffitt and Randles (1977).

As a general rule it seems that the chi-square statistic may be properly used in cases where the expected values are much lower than previously considered permissible, although this is not always true, as Tate and Hyer and Overall have shown. The practitioner must take into consideration the level of precision required by his work. The closer one desires to be to the exact probabilities of the multinomial, the larger the sample sizes and expected values must be. For most applications, Cochran's rule, which states that all expected values be greater than one and not more than 20% be less than five, offers a fair balance between practicality and precision. The more exploratory the research, the more one may relax this rule. It also seems appropriate to relax this rule if the expected values, though small, are roughly equal.
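The rules of thumb reviewed in this section can be written as simple checks on a list of expected frequencies. The sketch below encodes the rules as they are worded in this paper; the function names and the example values are hypothetical, and the Roscoe and Byars thresholds are the ones quoted above for the uniform case.

```python
def cochran_rule(expected):
    """All expected values greater than one, and no more than 20% below five."""
    n_small = sum(1 for e in expected if e < 5)
    # 5 * n_small <= len(expected) is the 20% condition in integer arithmetic.
    return all(e > 1 for e in expected) and 5 * n_small <= len(expected)


def yarnold_rule(expected):
    """With s >= 3 classes and r expectations below five, the minimum
    expectation may be as small as 5r/s."""
    s = len(expected)
    if s < 3:
        return False
    r = sum(1 for e in expected if e < 5)
    return min(expected) >= 5 * r / s


def roscoe_byars_rule(expected, alpha=0.05):
    """Average expected value of at least 2 (alpha = .05) or 4 (alpha = .01)."""
    thresholds = {0.05: 2, 0.01: 4}
    if alpha not in thresholds:
        raise ValueError("only alpha = .05 and alpha = .01 are covered by the rule")
    return sum(expected) / len(expected) >= thresholds[alpha]


# Hypothetical expected frequencies for five classes.
expected = [2.0, 2.0, 8.0, 8.0, 10.0]
print(cochran_rule(expected), yarnold_rule(expected), roscoe_byars_rule(expected))
```

In this example two of the five expected values fall below five, so Cochran's rule is not met, while the more lenient Yarnold and Roscoe and Byars criteria are satisfied. The rules can therefore disagree on the same table.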

Neglect of Frequencies of Non-Occurrence and Failure to Equalize the Sum of Observed and Expected Values

In his reply to Lewis and Burke, Peters (1950) took exception to the propriety of claiming that these aspects are sources of error. Peters stated that one's research questions determined whether the frequencies of non-occurrence should be included in the calculations. He further stated that it is the computational formula

$$X^2 = \sum_{i} \frac{(x_i - e_i)^2}{e_i}$$

which requires that the sum of observed and expected frequencies be equal, but not the generalized definition which includes the true population mean and variance.

In their rejoinder, Lewis and Burke (1950) show why they were correct in listing both of these points as errors. The basis is the proof of the underlying theorem of chi-square shown by Cramér (1946) and its generalization: Given the assumption that the frequencies for all possible outcomes are used and that the sum of the observed frequencies equals the sum of the theoretical, Cramér's proof holds for equation (4).

Therefore, in all applications of this formula, the sum of the observed frequencies must equal the sum of the expected. In addition, frequencies for all of the possible outcomes must be included in the calculation. That is, in a test of the homogeneity of several groups based on the number within each group having a certain property, the calculation of chi-square must include the frequencies of those sample members, within each group, who do not have that property.
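The effect of omitting the frequencies of non-occurrence can be seen in a small numerical example. The data below are hypothetical and the function names are assumptions made for this sketch: three groups of 100 subjects each, of whom 30, 40, and 50 have the property in question.

```python
def pearson_chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all classes."""
    return sum((x - e) ** 2 / e for x, e in zip(observed, expected))


def chi_square_table(table):
    """Statistic for a two-way table, with expected values from the margins."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    x2 = 0.0
    for i, row in enumerate(table):
        for j, x in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total
            x2 += (x - e) ** 2 / e
    return x2


have_property = [30, 40, 50]              # hypothetical occurrences per group
group_sizes = [100, 100, 100]
lack_property = [n - x for n, x in zip(group_sizes, have_property)]

# Incorrect: only the occurrences, tested against their own mean.
mean_occurrence = sum(have_property) / len(have_property)
x2_occurrences_only = pearson_chi_square(have_property, [mean_occurrence] * 3)

# Correct: the full 2 x 3 table, including the frequencies of non-occurrence.
x2_full_table = chi_square_table([have_property, lack_property])

print(round(x2_occurrences_only, 3), round(x2_full_table, 3))
```

The two calculations give different values for the same data, and only the second uses every possible outcome with observed and expected totals that agree in each group.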



A recent example of this was cited by Slaughter and Marascuilo (Note 2). In 1979, Scheuneman presented a method for assessing bias in test items using a modified chi-square procedure. Basically, the procedure involves dividing the number of correct responses to an item, by group, into several categories based on total raw score. A chi-square is then calculated to test that the proportions passing an item, within the ability categories, are the same for each group. An item is defined as biased if the chi-square value is significant.

Slaughter and Marascuilo point out that, for the statistic to approximate the chi-square distribution, its calculation must include the frequency in each group that failed the item. Scheuneman makes reference to this when she states,

"It should be noted, however, that because the modified procedure does not include incorrect responses, the obtained distribution of chi-square values may not always approximate the chi-square distribution, particularly if the sample sizes for the groups being compared are quite different, or the cell frequencies are very large" (p. 147).

But she does not indicate why she does not use the more exact procedure.

Slaughter and Marascuilo demonstrate the proper use of this procedure and indicate that a substantial number of the items judged as fair, by her definition, are in fact biased. Given that this method of assessing item bias is itself somewhat rough, there is no justification for weakening it even further by excluding the incorrect responses as Scheuneman proposes.




Indeterminate Theoretical Frequencies

It is possible that the theoretical frequencies, the expected values against which each observed value is compared, may not be calculable. Lewis and Burke have illustrated such a case in a hypothetical coin-guessing experiment in which subjects recorded their guesses as to whether a head or tail would appear on each of four tosses. The numbers of correct guesses, ranging from zero to four, were compared to the expected frequencies generated by the binomial distribution function. Since the four guesses of each subject could not be considered independent of one another, the theoretical distribution is clearly not binomial. In this case the most one could do would be to test the obtained distribution values against values given by some other research using the same experimental design.



As can be seen, the problem of indeterminate theoretical frequencies arises in the test for goodness-of-fit where category choice is arbitrary. In their final paragraph on this subject, Lewis and Burke offer a guideline for deciding if the theoretical frequencies are indeed calculable. They state,

"It is usually true that theoretical frequencies are incalculable if the observed frequencies are in any way related, and also if mutually contradictory assumptions can be made, with about equal justification, concerning the likelihood of occurrence or non-occurrence of the events (responses) that yielded the observed frequencies."



Incorrect or Questionable Categorizing

In deciding upon the categories to be used, care must be taken in their selection, especially when the choice is arbitrary. The value of the test statistic will be unduly inflated if one or more of the categories contains a substantial number of observations in only one cell of that category. Lewis and Burke provide an excellent example of this.

In a study comparing the drawings of normal and abnormal subjects, one of the categories for classifying the drawings was labeled "fantastic compositions". As one would expect, all 26 of the drawings placed in this class were drawn by abnormal subjects. The individual $X^2$ value for this group (24.0) accounted for 25% of the total $X^2$ (99.6) even though only 5% of the total frequencies fell into this category. Lewis and Burke offer two general rules to follow which should help in dealing with this problem: 1) categories for frequency data should be established, whenever possible, on the basis of completely external criteria, and 2) information on the reliability of the categories should be offered. This becomes very important as the choice of categories becomes more and more arbitrary. A careful logical examination of a study design, such as the one mentioned above, may not always be possible.

Use of Non-Frequency Data

A simple example will show that the formula

$$X^2 = \sum_{i} \frac{(x_i - e_i)^2}{e_i}$$

can only be applied to frequency data. Given an observed frequency of four and an expected frequency of two we have for the single term

$$\frac{(4 - 2)^2}{2} = 2.$$

Let us assume that the four and two are measures on some scale such as pounds, inches, or even a ratio such as errors per minute. If one were to change the scale of measurement, converting pounds to ounces, inches to feet, or errors per minute to errors per 30 seconds, the value of all terms calculated would change by the same factor. Thus, for example, to double the number of units in the scale of measurement would change the observed value of this hypothetical example from four to eight, the expected value from two to four, and the resulting single term would equal

$$\frac{(8 - 4)^2}{4} = 4.$$

The chi-square value would be doubled solely by this change in metric.
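The dependence on the unit of measurement is easy to verify. This short sketch repeats the worked example and adds one further rescaling; the function name `chi_square_term` and the factor of 16 (pounds to ounces) are illustrative assumptions.

```python
def chi_square_term(observed, expected):
    """A single term (observed - expected)^2 / expected of the computing formula."""
    return (observed - expected) ** 2 / expected


original = chi_square_term(4, 2)            # measures of 4 and 2 on some scale
doubled = chi_square_term(8, 4)             # same quantities, twice as many units
in_ounces = chi_square_term(4 * 16, 2 * 16) # pounds converted to ounces

print(original, doubled, in_ounces)  # 2.0 4.0 32.0
```

Multiplying both values by a factor k multiplies the term by k, whereas true frequencies have no unit that could be changed in this way.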



It must be made clear that this is not to say that either the chi-square statistic or the function of its limiting distribution are derived from, or refer only to, frequencies. However, the computing formula (4) can only properly be applied to frequencies of independent observations.

Incorrect Determination of the Number of Degrees of Freedom

One way to interpret the number of degrees of freedom associated with a contingency table is to note that it represents the number of independent pieces of information contained in the sample about the truth of an hypothesis under test. That is, if we have a set of N numbers which may take on any values with the restriction that they add to a given value, then N-1 of them are free to vary. The one remaining value is determined, as it must be that single value which, when added to the sum of the N-1 numbers, equals the value given by the restriction. Thus, N data points with a single restriction have N-1 degrees of freedom. Every restriction imposed decreases the available information contained in the data.

For a contingency table with r rows and c columns, the degrees of freedom equal (r-1)(c-1). This holds regardless of whether one is testing two variables measured on a single group for independence, or whether one has c groups which are being tested for homogeneity across r rows or categories. But this is true for different reasons. Marascuilo and McSweeney (1977) present a discussion of this and the following is taken from their presentation.

In the test of homogeneity, one has an r x c contingency table where the number of columns, c, corresponds to the number of independent samples. As the expected frequencies of the r categories for sample c must add to $n_{.c}$, there are (r-1) degrees of freedom in that one sample. For the c samples, there exist c(r-1) degrees of freedom. In addition, the r proportions are unknown and must be estimated. As they must sum to unity, (r-1) of them are free to vary. The degrees of freedom for the entire table, therefore, equal c(r-1) - (r-1) = (r-1)(c-1).



In the test for independence, only a single sample size is known. The frequencies must sum to this value, leaving (rc-1) degrees of freedom. In the case of two variables, the probabilities for the levels of each variable must add to one. For r levels of one variable, (r-1) are free to vary. For c levels of the second variable, (c-1) need to be estimated. The entire table thus has

$$\text{degrees of freedom} = (rc - 1) - (r - 1) - (c - 1) = (r - 1)(c - 1).$$
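The two lines of reasoning, one for homogeneity and one for independence, arrive at the same count. The sketch below simply encodes each argument as arithmetic and confirms that they agree; the function names are hypothetical.

```python
def df_homogeneity(r, c):
    """c independent samples contribute (r - 1) each; estimating the r common
    proportions, which sum to one, removes (r - 1)."""
    return c * (r - 1) - (r - 1)


def df_independence(r, c):
    """rc cells with one fixed total leave (rc - 1); estimating the row and
    column probabilities removes (r - 1) and (c - 1)."""
    return (r * c - 1) - (r - 1) - (c - 1)


def df_table(r, c):
    """The familiar closed form for an r x c contingency table."""
    return (r - 1) * (c - 1)


# Both derivations reduce to (r - 1)(c - 1) for every table size checked.
for r in range(2, 7):
    for c in range(2, 7):
        assert df_homogeneity(r, c) == df_independence(r, c) == df_table(r, c)

print(df_table(3, 4))  # 6
```

The agreement is algebraic, so the loop is only a check on the arithmetic rather than a proof.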

Incorrect Computations 



Mechanical errors aside, any of the aforesaid errors would lead, in effect, to an incorrectly computed test statistic. Lewis and Burke noted one computational error in particular that is easy to make and should be guarded against. This error involves the failure to weight by n when proportions are used instead of frequencies.

As mentioned previously, a chi-square value calculated on non-frequency data can be altered by a change in scale. Given the same data, a change from meters to centimeters will increase the value of chi-square by a factor of 100. As a proportion is the ratio of observed frequency to total, a chi-square calculated on proportions will be altered by changing the scale. A change of errors per minute to errors per 120 seconds will double the value of chi-square.

Most proportions encountered will be of the form

$$p_{rc} = \frac{f_{rc}}{n_{.c}}$$

where $f_{rc}$ is the frequency in the cell defined by row r and column c and where $n_{.c}$ is the total frequency for column c. To convert a proportion to a frequency merely requires that the proportion be weighted by $n_{.c}$. While contingency tables containing proportions are often more interpretable, a chi-square must be calculated using the frequencies from which the proportions were determined.
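Weighting each proportion by its column total recovers the frequencies on which the statistic must be computed. The sketch below uses a hypothetical 2 x 2 table of column proportions with unequal column totals; the names and values are assumptions made for illustration.

```python
def chi_square_table(table):
    """Statistic for a two-way table, with expected values from the margins."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    x2 = 0.0
    for i, row in enumerate(table):
        for j, x in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total
            x2 += (x - e) ** 2 / e
    return x2


# Hypothetical column proportions and the column totals they came from.
proportions = [[0.4, 0.5],
               [0.6, 0.5]]
column_totals = [50, 200]

# Weight each proportion by its column total to recover the frequency.
frequencies = [[round(p * n) for p, n in zip(row, column_totals)]
               for row in proportions]

x2_from_proportions = chi_square_table(proportions)   # not a valid chi-square
x2_from_frequencies = chi_square_table(frequencies)   # the appropriate value

print(frequencies)
print(round(x2_from_proportions, 4), round(x2_from_frequencies, 4))
```

The value computed on the raw proportions ignores the sample sizes entirely, which is the computational error described above.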

Additional Issues

Further research regarding the properties of chi-square has been conducted since the publication of the Lewis and Burke paper. Methods have been developed to strengthen the chi-square test. Also, closer examination of its properties, such as the use of a correction for continuity, has been conducted. Perhaps one of the best papers on this subject was written by Cochran (1954). He presented methods for dealing with some specific contingency table designs and probability distributions. In addition to the previously mentioned recommendations regarding minimum expected values, he discussed testing goodness-of-fit in different distributions, degrees of freedom in 2 x N tables, and combining 2 x 2 tables. The remainder of this paper deals with further issues in the use of chi-square.

Partitioning

At about the same time that Lewis and Burke were writing, the first extensive work on the partitioning of an I x J contingency table into components was being conducted by Lancaster (1949, 1950). He demonstrated that a general term of a multinomial can be reduced to a series of binomial terms, each with one degree of freedom. Irwin (1949) presented a formula for exact partitioning which was simplified algebraically by Kimball (1954) for easier computation. In 1960, Kastenbaum generalized the partitioning procedure to handle cases where some of the desired partitions contained more than one degree of freedom. Castellan (1965) reviewed these partitioning procedures and argued for their use in place of constructing a series of 2 x 2 tables based on the following two points.

First, in setting up the full contingency table, it is assumed that the marginal totals represent the population values. It is more likely that the marginals for any 2 x 2 table, taken from the full table, will not adequately reflect those population values. Instead, they will reflect a population different from other populations generated from the same table. There will be as many populations represented as there are 2 x 2 tables produced.

Second, following the procedure Castellan presented, the 2 x 2 tables are additive. The sum of their individual chi-square values equals the chi-square value for the original table. This independence of tables produces uncorrelated chi-squares and thus allows for more meaningful interpretation.
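The additivity property can be checked numerically. The sketch below uses the short-cut component formula for a 2 x c table in the form commonly attributed to Kimball (1954); the formula as coded, the function names, and the data are our assumptions for illustration and are not taken from the papers reviewed here. Component t compares columns 1 through t, pooled, with column t+1.

```python
def pearson_chi_square(table):
    """Pearson X^2 for a two-way table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum(
        (obs - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(table)
        for j, obs in enumerate(row)
    )


def kimball_components(row_a, row_b):
    """One-degree-of-freedom components of X^2 for a 2 x c table.

    Expected values are implicitly those of the full table, which is
    what makes the components sum to the overall X^2.
    """
    big_a, big_b = sum(row_a), sum(row_b)
    n = big_a + big_b
    components = []
    cum_a = cum_b = 0
    for t in range(len(row_a) - 1):
        cum_a += row_a[t]
        cum_b += row_b[t]
        a_next, b_next = row_a[t + 1], row_b[t + 1]
        n_next = a_next + b_next
        cum_n = cum_a + cum_b
        numerator = n ** 2 * (b_next * cum_a - a_next * cum_b) ** 2
        denominator = big_a * big_b * n_next * cum_n * (cum_n + n_next)
        components.append(numerator / denominator)
    return components


# Hypothetical 2 x 3 table.
row_a = [3, 1, 2]
row_b = [1, 2, 1]
components = kimball_components(row_a, row_b)
total = pearson_chi_square([row_a, row_b])
print(components, sum(components), total)
```

For a table with only two columns the single component reduces to the familiar 2 x 2 chi-square, which is a quick sanity check on the formula as written.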

Bresnahan and Shapiro (1966) examined methods for partitioning, including the methods for determining possible partitions. They concluded that all forms of partitioning follow three basic rules: 1) each cell appears alone once and only once, 2) the same combination of cells appears only once, and 3) the dividing lines of a partition do not hold for other partitions. Following these rules, additional partitioning schemes may be employed. They derive a general equation for the chi-square which may be applied to any table that may result from partitioning. The equation for an I' x J' table is written as follows:

[equation illegible in the source]

where I' is the number of rows in the partition and J' is the number of columns in the partition. [The remaining definition is illegible in the source.]

Bresnahan and Shapiro advocated the use of this formula in cases where some cells have low expected values. Instead of pooling data or discarding it because of the low expected values, one can calculate a chi-square based on the table configuration with adequate expected values. This value will be the contribution of that part of the table to the chi-square for the entire table.
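A sketch of how the contribution of one part of a table might be computed when the expected values are taken from the full table. The algebraic form coded here is an assumption on our part, chosen because it reproduces exact additivity in the example; it is not claimed to be the notation or the equation of Bresnahan and Shapiro. The helper names and the data are hypothetical.

```python
def expected_values(table):
    """Expected frequencies under independence for the full table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def subtable_chi_square(observed, expected):
    """X^2 contribution of a sub-table whose expected values come from the full table."""
    cell_term = sum(
        o * o / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )
    row_term = sum(
        sum(o_row) ** 2 / sum(e_row) for o_row, e_row in zip(observed, expected)
    )
    col_term = sum(
        sum(o_col) ** 2 / sum(e_col)
        for o_col, e_col in zip(zip(*observed), zip(*expected))
    )
    total_term = sum(map(sum, observed)) ** 2 / sum(map(sum, expected))
    return cell_term - row_term - col_term + total_term


full = [[3, 1, 2], [1, 2, 1]]
expected = expected_values(full)

# Partition 1: columns 1 and 2 as they stand.
obs_1 = [row[:2] for row in full]
exp_1 = [row[:2] for row in expected]

# Partition 2: columns 1 and 2 pooled, compared with column 3.
obs_2 = [[row[0] + row[1], row[2]] for row in full]
exp_2 = [[row[0] + row[1], row[2]] for row in expected]

parts = [subtable_chi_square(obs_1, exp_1), subtable_chi_square(obs_2, exp_2)]
whole = subtable_chi_square(full, expected)  # equals the ordinary X^2
print(parts, sum(parts), whole)
```

Applied to the full table the function returns the ordinary Pearson statistic, and the two partition values sum to it.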



Schaffer (1973) has taken exception to the use of these methods of partitioning, claiming that they do not actually test the questions of interest. For example, a 2 x 4 table may be partitioned into three separate tests, each with one degree of freedom. Schaffer then demonstrated that to test the first of the three resulting hypotheses actually entails testing that all three partitions do not contain significant differences against the alternate hypothesis that the first partition is significant and that the other two are not. This results from the fact that the data from the entire table enter the calculation for a portion of the table in the determination of the expected values. She therefore contends, contrary to Castellan, that the data from the entire table should not enter into a partition since the test produced is not the statistic desired.

On the basis of this argument Schaffer proposes the use of the likelihood ratio statistic. Though it does not partition exactly, its use overcomes the problem of testing inappropriate hypotheses. Schaffer notes that while there is "no evidence for the superiority of one method over another," Pearson's method has historical priority and a greater ease of computation.

Regardless of which method one uses, partitioning increases the amount of information one is able to glean from the data. If the partitions are orthogonal to one another, the information rendered from each partition does not overlap with any other. However, Schaffer's paper presents an interesting quandary.

If one requires a test of a partition independent of the structure of the rest of the partitions, then one must use the log-likelihood ratio as she proposed. The lack of additivity of the likelihood ratio may not always be problematic. Often, only one partition is meaningful and/or accounts for much of the total $X^2$. In such cases, the choice between the use of the log-likelihood statistic and chi-square rests on the alternate hypothesis that is of interest. If one wishes to test a single partition for homogeneity against the hypothesis that it is not homogeneous and the rest of the partitions are, then chi-square is appropriate. If the test is to be done completely independent of the structure of the rest of the table, then the log-likelihood ratio is the method of choice. The log-likelihood ratio has been proposed for use in more than the analysis of partitions, as will be discussed in the next section.



Log Likelihood Ratio

An alternative procedure to calculating $X^2$ to test a hypothesis concerning a multinomial is the use of the likelihood ratio statistic. It is a maximum likelihood estimate labeled $G^2$ and defined as

$$G^2 = 2 \sum_i O_i \ln\frac{O_i}{E_i}$$

where $O_i$ and $E_i$ are the observed and expected frequencies in cell i.
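As a minimal sketch, the two statistics can be computed side by side for a single multinomial sample. The function names and the data are hypothetical, and the likelihood ratio statistic is coded in its standard form, $G^2 = 2\sum O \ln(O/E)$, with cells having an observed count of zero contributing nothing to the sum.

```python
import math


def pearson_x2(observed, expected):
    """Pearson X^2 = sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def likelihood_ratio_g2(observed, expected):
    """G^2 = 2 * sum of O * ln(O / E); zero counts are skipped (0 * ln 0 = 0)."""
    return 2.0 * sum(
        o * math.log(o / e) for o, e in zip(observed, expected) if o > 0
    )


# Hypothetical sample of 100 observations tested against equal probabilities.
observed = [18, 22, 30, 30]
expected = [25.0, 25.0, 25.0, 25.0]

x2 = pearson_x2(observed, expected)
g2 = likelihood_ratio_g2(observed, expected)
print(round(x2, 3), round(g2, 3))
```

For these illustrative data the two values differ by roughly a tenth of a unit, which would rarely change a decision at three degrees of freedom.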

In their text on discrete multivariate analysis, Bishop, Fienberg, and Holland (1975) used log-linear models, as opposed to additive models, for contingency table analysis. As a summary statistic they stated a preference for Maximum Likelihood Estimators (MLEs) on theoretical grounds. Additionally, practical reasons for the use of this procedure were given:

1. Ease of computation for linear models.

2. MLEs satisfy certain marginal constraints they call intuitive.

3. "The method of maximum likelihood can be applied directly to multinomial data with several observed cell values of zero, and almost always produces non-zero estimates for such cells (an extremely valuable property in small samples)" (p. 58).



They further state,

"[As] MLEs necessarily give minimum values of $G^2$, it is appropriate to use $G^2$ as a summary statistic, although the reader will observe that, [in] those samples where we compute both $X^2$ and $G^2$, the difference in numerical value of the two is seldom large enough to be of practical importance" (p. 126).

There are cases where the likelihood ratio statistic may be preferred over chi-square. Such may occur when some expected values are quite small or where the contingency table contains a structural zero. This occurs when a design contains a cell which can never logically be filled. Bishop, Fienberg, and Holland offer the example of a classification of type of surgery by sex. The cell defined by male-hysterectomy would never contain an entry.

Several investigators have compared $X^2$ and $G^2$ in a variety of research situations. Chapman (1976) provides an overview of much of this research, including the work of Neyman and Pearson (1931), Cochran (1936), Fisher (1950), Good, Gover, and Mitchell (1970), and West and Kempthorne (1972). From these comparisons, neither of the two procedures emerges a clear favorite. When one method is better in some respect than the other, it seems to result from a particular configuration of sample size, number of categories, expected values, and the alternative hypothesis. If a general statement were to be made, it would appear that the log-likelihood ratio statistic tends to produce a closer approximation to the chi-square distribution in many cases. But this statement must be regarded with two considerations in mind.

As most studies on this matter are confined to examining a few of the many possible cross-classification designs where these two statistics might be used, such a statement must be deemed tentative. In some situations neither measure is preferred over the other. In other cases a slight modification in design or sample size may equalize the performance of both statistics. As a result it is very difficult to synthesize this collection of work in order to reach a definitive recommendation valid for all research, or even a majority.



Also, the actual differences observed may be so small that they are inconsequential to the researcher. As in the debate over the matter of expected values, before a decision can be made one must place the question within the context of actual practice. The more one's research demands precision, the more closely one should consider any differences in the statistics one may employ. Further, one should look closest at the research using conditions most similar to one's own design.

Correction for Continuity

In a single paragraph, Lewis and Burke present the correction for continuity, noting that it is justified only in the case of a 2 x 2 table. Their treatment of the subject has the air of a proven method which is utilized without question. But questions have arisen since Lewis and Burke regarding the appropriateness of its use.

Since categorical variables are discrete and the chi-square distribution is continuous, a compensation can be made by adding or subtracting 1/2 to each observed frequency, so as to move the observed value closer to the expected value. Thus it becomes more difficult to reject the hypothesis under test. Symbolically, the corrected chi-square is written as

$$X_c^2 = \sum_i \frac{\left(|O_i - E_i| - \tfrac{1}{2}\right)^2}{E_i}.$$

In the case of the 2 x 2 table where

$$X^2 = \frac{N\,(X_{11}X_{22} - X_{12}X_{21})^2}{X_{1.}\,X_{2.}\,X_{.1}\,X_{.2}} \tag{8}$$

the correction proposed by Yates (1934) is calculated as

$$X_c^2 = \frac{N\left(|X_{11}X_{22} - X_{12}X_{21}| - N/2\right)^2}{X_{1.}\,X_{2.}\,X_{.1}\,X_{.2}}. \tag{9}$$

The analytical derivation of the correction expressed in (9) is given by Cox (1970).



The disagreement over the use of this correction is based not on its theoretical grounding but on its applicability. Plackett (1964), confirming empirical results of Pearson (1947), argued that the correction is inappropriate if the data come from independent binomial samples. Grizzle (1967) extended Plackett's results to the general case, concluding that the correction is so conservative it is rendered useless for practical purposes.

Supporting the use of the correction, Mantel and Greenhouse (1968) have taken exception to the views of Plackett and others. They base their objection on two points. First, they state that the proper model for a 2 x 2 table is a fixed marginal total model. In such a model the correction is not overly conservative. Second, the correction improves the probability estimates except in extreme cases. Such cases occur when the hypergeometric (or binomial) distribution deviates from symmetry beyond some fairly extreme level.

Pirie and Hamdan (1972) attempted to straddle the controversy by deriving corrections for continuity for unconditional, that is random marginal, models. For the 2 x 2 test of independence they arrive at a correction of 1/2 instead of N/2, written as

$$X_c^2 = \frac{N\left(|X_{11}X_{22} - X_{12}X_{21}| - \tfrac{1}{2}\right)^2}{X_{1.}\,X_{2.}\,X_{.1}\,X_{.2}}.$$

The probability levels resulting from the use of this correction fall between those produced by the uncorrected statistic (8) and the corrected (9).
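The ordering of the three statistics can be illustrated with a small sketch. The cell counts are hypothetical, the function names are ours, and the guard that sets a negative corrected difference to zero is a common convention that we assume rather than something stated in the papers discussed. The upper-tail probability for one degree of freedom is obtained from the complementary error function.

```python
import math


def two_by_two_statistics(x11, x12, x21, x22):
    """Return (uncorrected, correction of 1/2, Yates correction of N/2)."""
    n = x11 + x12 + x21 + x22
    margins = (x11 + x12) * (x21 + x22) * (x11 + x21) * (x12 + x22)
    cross = abs(x11 * x22 - x12 * x21)
    uncorrected = n * cross ** 2 / margins
    # Assumed convention: a correction larger than the difference yields zero.
    half = n * max(cross - 0.5, 0.0) ** 2 / margins
    yates = n * max(cross - n / 2, 0.0) ** 2 / margins
    return uncorrected, half, yates


def upper_tail_p(x2):
    """P(chi-square with 1 df > x2), using P(|Z| > sqrt(x2))."""
    return math.erfc(math.sqrt(x2 / 2.0))


uncorrected, half, yates = two_by_two_statistics(12, 5, 6, 13)
p_values = [upper_tail_p(v) for v in (uncorrected, half, yates)]
print(round(uncorrected, 4), round(half, 4), round(yates, 4))
print([round(p, 4) for p in p_values])
```

In this example the probability attached to the 1/2 correction falls between the uncorrected and the Yates-corrected values, as described in the text.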

This issue was next addressed by Conover (1974a) and several short comments that immediately followed his article. Elaborating on a stance he had taken in 1971, Conover proposed that the correction for continuity should only be used in 2 x 2 tables if the row and column totals are non-random and either one or the other pair of the row or column totals are equivalent to each other. If this is not the case, Conover maintains that the correction is overly conservative. In his response, Mantel (1974) agreed that a fixed-marginal model is appropriate and proposed a separate correction for each tail of the distribution. Conover (1974b) concurred with this method and recommended it be used in place of the Yates correction when the table totals are non-random.

In the subsequent papers Miettinen (1974) agreed with Conover's position. Starmer, Grizzle and Sen (1974) presented simulation results which support the contention that when the column totals are non-random an uncorrected chi-square is to be preferred over the more conservative corrected procedure.

More recently, Everitt (1977) recommended the use of the correction but offered no support for his decision. Camilli and Hopkins (1978), on the other hand, have presented results from a Monte Carlo study confirming the stance taken by Conover, et al. Their results demonstrated that a Yates correction decreases the accuracy of probability statements when either, or both, of the margins are not fixed.

The consensus seems to be that the correction for continuity becomes overly conservative when either or both of the marginals in a table are random. As this is often the case in social science research, it would appear that the use of the correction should not be given the blanket recommendation that often accompanies it. If strong conservatism is desired and/or the marginal totals in the contingency table being analyzed are fixed values, then the Yates correction should be applied. However, in all other cases one must be cautious in its use as the correction for continuity will produce very conservative probability estimates. When a correction is desired and the table being analyzed does not have fixed marginal values, the work of Pirie and Hamdan should be considered carefully.

Comparison of Two Independent Chi-Squares

Situations may occur in which one may want to test the equality of two independent chi-square values. Knepp and Entwisle (1969) have presented, in tabular form, the one and five percent critical values for this comparison for $\nu$ = 1 to 100. They also mention a normal approximation calculated as

[equation illegible in the source]

where $X_1^2$ and $X_2^2$ are two independent sample chi-square values, each with $\nu$ degrees of freedom. The statistic z is approximately distributed as a unit normal variable.

D'Agostino and Rosman (1971) have offered another simple normal approximation for comparing two chi-square values in the form of

$$z = \frac{\sqrt{X_1^2} - \sqrt{X_2^2}}{\sqrt{1 - \dfrac{1}{4\nu}}}. \tag{11}$$

This approximation was tested by Monte Carlo methods and found to be quite good for cases with $\nu$ > 2. For $\nu$ = 1 the researcher must use Knepp and Entwisle's tabled values of 2.19 for $\alpha$ = .05 and 3.66 for $\alpha$ = .01. D'Agostino and Rosman also note that for $\nu$ > 20 the denominator in (11) makes little difference and

$$z = \sqrt{X_1^2} - \sqrt{X_2^2}$$

may be used in place of (11).
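A brief sketch of the comparison, assuming the approximation takes the form $z = (\sqrt{X_1^2} - \sqrt{X_2^2}) / \sqrt{1 - 1/(4\nu)}$. That form is our reading of the D'Agostino and Rosman approximation and should be checked against their paper before use; the function name and the chi-square values are hypothetical.

```python
import math


def compare_chi_squares(x2_first, x2_second, df, simplified=False):
    """Approximate unit-normal z for the difference between two independent
    chi-square values that share the same degrees of freedom (assumed form)."""
    difference = math.sqrt(x2_first) - math.sqrt(x2_second)
    if simplified:
        # Version without the denominator, suggested for large df.
        return difference
    return difference / math.sqrt(1.0 - 1.0 / (4.0 * df))


z_small_df = compare_chi_squares(16.0, 9.0, df=5)
z_large_df = compare_chi_squares(16.0, 9.0, df=25)
z_simple = compare_chi_squares(16.0, 9.0, df=25, simplified=True)
print(round(z_small_df, 4), round(z_large_df, 4), round(z_simple, 4))
```

With 25 degrees of freedom the denominator changes z by about half of one percent, which is consistent with the remark that it may be dropped for large degrees of freedom.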

Comparison of Individual Proportions

The chi-square procedure, as Berkson noted in 1938, is an omnibus test. In the case of a test for homogeneity among K groups classified by J levels of the dependent variable A, the hypothesis under test is that

$$H_0:\quad \begin{aligned} P(A_1 \mid G_1) &= P(A_1 \mid G_2) = \cdots = P(A_1 \mid G_K) \\ P(A_2 \mid G_1) &= P(A_2 \mid G_2) = \cdots = P(A_2 \mid G_K) \\ &\;\;\vdots \\ P(A_J \mid G_1) &= P(A_J \mid G_2) = \cdots = P(A_J \mid G_K) \end{aligned}$$

against the alternative that $H_0$ is false. If the hypothesis is rejected, one would like to be able to find the contrasts among the proportions that are significantly different from zero.

This may be accomplished by a well known procedure which allows one to construct simultaneous confidence intervals for all contrasts of the proportions in the design, across groups, while maintaining the specified Type I error probability. The method is an extension of Scheffé's (1959) theorem which is used for the construction of contrasts in the analysis of variance.

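If a linear contrast in the population proportions in a contingency table is denoted as $\psi$, the sample estimate is $\hat{\psi}$ and is defined as

$$\hat{\psi} = \sum_{k=1}^{K} a_k \hat{p}_k$$

where $\hat{p}_k$ is the proportion in group k and $\sum a_k = 0$. The limiting probability is $(1 - \alpha)$ that, for all contrasts,

$$\hat{\psi} - \sqrt{\chi^2_{K-1;\,1-\alpha}}\; SE_{\hat{\psi}} \;<\; \psi \;<\; \hat{\psi} + \sqrt{\chi^2_{K-1;\,1-\alpha}}\; SE_{\hat{\psi}}$$

where

$$SE^2_{\hat{\psi}} = \sum_{k=1}^{K} a_k^2\, \frac{\hat{p}_k (1 - \hat{p}_k)}{n_k}$$

and $\chi^2_{K-1;\,1-\alpha}$ is the $(1 - \alpha)$th percent value from the chi-square distribution at K - 1 degrees of freedom. Some of the earlier work with this procedure may be found in Gart (1962), Gold (1963), and Goodman (1964).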
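A sketch of the post hoc interval for a contrast among K independent proportions, assuming the usual large-sample form: the estimate plus or minus the square root of the chi-square critical value times the standard error, with squared standard error equal to the sum of $a_k^2 \hat{p}_k(1-\hat{p}_k)/n_k$. The function name and data are hypothetical, and the critical value is supplied by the user (5.991 is the 95th percentile of chi-square with 2 degrees of freedom).

```python
import math


def contrast_interval(successes, totals, weights, chi2_critical):
    """Simultaneous confidence interval for a contrast in K proportions."""
    if abs(sum(weights)) > 1e-12:
        raise ValueError("contrast weights must sum to zero")
    proportions = [s / n for s, n in zip(successes, totals)]
    estimate = sum(a * p for a, p in zip(weights, proportions))
    variance = sum(
        a * a * p * (1 - p) / n for a, p, n in zip(weights, proportions, totals)
    )
    half_width = math.sqrt(chi2_critical) * math.sqrt(variance)
    return estimate - half_width, estimate + half_width


# Hypothetical data: three groups of 100, with 30, 45, and 60 "successes".
successes = [30, 45, 60]
totals = [100, 100, 100]
CHI2_2DF_95 = 5.991

first_vs_second = contrast_interval(successes, totals, [1, -1, 0], CHI2_2DF_95)
first_vs_third = contrast_interval(successes, totals, [1, 0, -1], CHI2_2DF_95)
print(first_vs_second, first_vs_third)
```

An interval that excludes zero identifies a contrast that is significantly different from zero at the familywise level. Here only the comparison of the first and third groups does so.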

The only drawback to this post hoc procedure is its lack of power relative to a planned set of contrasts. In place of the use of $X^2$ followed by post hoc exploration using the confidence interval defined above, one may employ a series of planned contrasts. A more powerful procedure results from the use of a Bonferroni type critical value where the Type I error probability is spread over just the contrasts of interest. Such a value may be found in the table given by Dunn (1961). The value $\sqrt{\chi^2_{K-1;\,1-\alpha}}$ in the confidence interval is replaced by the value taken from Dunn's table at Q = the number of planned contrasts and [the remainder of this sentence is illegible in the source].



Analysis of Ordered Categories

In spite of its usefulness, there are conditions under which the use of Pearson's chi-square, although appropriate, is not the optimum procedure. Such a situation occurs when the categories forming a table have a natural ordering. The value of the statistic expressed in (1) will not be altered if the rows and/or columns in a table are permuted. However, if ordering of the rows or columns exists, their order cannot meaningfully be changed. This is information to which chi-square is not sensitive. Instead, the researcher may choose among three alternatives.

If both rows and columns contain a natural ordering, two methods are available. The first is a procedure taken from Maxwell (1961) as modified by Marascuilo and McSweeney (1977). It is used to test for a linear trend in the responses across categories.



The first step is to quantify the categories using any arbitrary numbering system. As the method is independent of the numbers chosen, both Maxwell and Marascuilo and McSweeney recommend numbers which simplify the computations, such as the linear coefficients in a table of orthogonal polynomials. These coefficients are then applied to the marginal frequencies to produce the sums and sums of squares for use in calculating a slope coefficient by the usual formula,

$$\hat{\beta} = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}.$$

Under the assumption that $\beta = 0$, the standard error of $\hat{\beta}$ is calculated as

$$SE^2_{\hat{\beta}} = \frac{S_Y^2}{(N-1)\,S_X^2}$$

where $S_X^2$ and $S_Y^2$ are the sample variances of the assigned scores. Then the hypothesis of no linear trend may be tested by

$$X^2 = \frac{\hat{\beta}^2}{SE^2_{\hat{\beta}}}.$$

A decomposition of the total chi-square for the contingency table is obtained by taking $X^2$ (total) - $X^2$ (due to linearity) = $X^2$ (residual). This may often be a more meaningful analysis.
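The linear trend test can be sketched as follows, assuming the slope is the usual least-squares slope of the column scores on the row scores and that its squared standard error under the null hypothesis is the variance of the column scores divided by (N - 1) times the variance of the row scores. Under those assumptions the statistic equals (N - 1) times the squared correlation between the scores. The function names, scores, and table are hypothetical.

```python
def pearson_chi_square(table):
    """Pearson X^2 for a two-way table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum(
        (obs - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(table)
        for j, obs in enumerate(row)
    )


def linear_trend_chi_square(table, row_scores, col_scores):
    """One-degree-of-freedom X^2 for linear trend: slope^2 / SE^2."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    mean_x = sum(x * f for x, f in zip(row_scores, row_totals)) / n
    mean_y = sum(y * f for y, f in zip(col_scores, col_totals)) / n
    ss_x = sum(f * (x - mean_x) ** 2 for x, f in zip(row_scores, row_totals))
    ss_y = sum(f * (y - mean_y) ** 2 for y, f in zip(col_scores, col_totals))
    ss_xy = sum(
        f * (x - mean_x) * (y - mean_y)
        for x, row in zip(row_scores, table)
        for y, f in zip(col_scores, row)
    )
    slope = ss_xy / ss_x
    se_squared = (ss_y / (n - 1)) / ss_x  # S_Y^2 / ((N - 1) S_X^2)
    return slope ** 2 / se_squared


# Hypothetical 3 x 3 table with ordered rows and columns.
table = [[20, 10, 5], [10, 15, 10], [5, 10, 20]]
linear = linear_trend_chi_square(table, [-1, 0, 1], [-1, 0, 1])
total = pearson_chi_square(table)
residual = total - linear
print(round(total, 3), round(linear, 3), round(residual, 3))
```

Replacing the scores -1, 0, 1 with 1, 2, 3 leaves the result unchanged, which illustrates the independence from the particular equally spaced numbers chosen.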

A second procedure involves the use of Kendall's (1970) rank tau, corrected for ties. If the observed tau is statistically significant, the hypothesis of no association is rejected. In addition, the statistic itself is a measure of association or array of the data. Further comments are contained in the section on measures of association.

When one of the two variables defining a table is ordered, Kruskal and Wallis' (1952) non-parametric one-way analysis-of-variance procedure may be utilized to test for equality of distributions.

Consider the case of an I x K contingency table where the dimension I is defined by mutually exclusive, ordered categories. The Kruskal-Wallis statistic is based on a simultaneous comparison of the sum of the ranks for the K groups. To apply the statistic in the case of an I x K table, the frequencies within a category along dimension I are considered to be tied and therefore are all assigned a midrank value. One then sums the ranks across I, within group k, to obtain the summed ranks used in calculating the statistic.
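The midrank procedure can be sketched as follows. The function name and the tables are hypothetical, and the statistic is computed with the standard Kruskal-Wallis formula including the usual correction for ties, which we assume is intended because every category is a block of tied observations.

```python
def kruskal_wallis_from_table(table):
    """Kruskal-Wallis H, corrected for ties, from an I x K table.

    table[i][k] is the frequency in ordered category i (rows) for group k (columns).
    """
    category_totals = [sum(row) for row in table]
    group_totals = [sum(col) for col in zip(*table)]
    n = sum(category_totals)

    # All observations in a category are tied and share the category's midrank.
    midranks = []
    ranked_so_far = 0
    for t in category_totals:
        midranks.append(ranked_so_far + (t + 1) / 2)
        ranked_so_far += t

    # Sum the ranks across categories within each group.
    rank_sums = [
        sum(rank * f for rank, f in zip(midranks, col)) for col in zip(*table)
    ]
    h = 12.0 / (n * (n + 1)) * sum(
        rs ** 2 / m for rs, m in zip(rank_sums, group_totals)
    ) - 3 * (n + 1)
    tie_correction = 1.0 - sum(t ** 3 - t for t in category_totals) / (n ** 3 - n)
    return h / tie_correction


same_distribution = [[10, 10], [20, 20], [10, 10]]
shifted_distribution = [[20, 5], [10, 10], [5, 20]]
h_same = kruskal_wallis_from_table(same_distribution)
h_shifted = kruskal_wallis_from_table(shifted_distribution)
print(round(h_same, 4), round(h_shifted, 4))
```

The resulting value is referred to the chi-square distribution with K - 1 degrees of freedom. Groups with identical distributions over the ordered categories give a value of zero.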

Measures of Association

As a final, and important note, a few words must be said about measures of association. It needs to be remembered that the value of a chi-square statistic is a function of sample size. To double the size of a sample, barring sample-to-sample fluctuations, will double the size of the associated chi-square. To compensate for this, the data analyst should always calculate an appropriate measure of association. To report probability levels alone is equivalent to reporting the sample size as an indication of the results. A proper measure of association should be included so as to allow for judging the practical, that is the meaningful, significance of the findings. While a proper treatment of this topic deserves a paper unto itself, because of the importance of this subject, an outline of the main measures will be included here.



We begin with the general case of an I x J contingency table. If the data are generated from a single sample, then the proper test is one of independence and a measure of association is the mean square contingency coefficient. Designated as $\phi^2$, its sample estimate is calculated as

$$\hat{\phi}^2 = \frac{X^2}{N}.$$

As the maximum value $\hat{\phi}^2$ may obtain is the minimum of I-1 or J-1, a correction for this is

$$\hat{\phi}'^{\,2} = \frac{X^2}{N \min(I-1,\; J-1)}$$

and is referred to as Cramér's measure of association (Cramér, 1946).
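The dependence of chi-square on sample size, and the stability of the measures of association, can be seen in a short sketch. The table is hypothetical and the function name is ours; the measures are computed as $X^2/N$ and $X^2/[N \min(I-1, J-1)]$.

```python
def association_summary(table):
    """Return (X^2, mean square contingency, Cramer's corrected measure)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    x2 = sum(
        (obs - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(table)
        for j, obs in enumerate(row)
    )
    phi_squared = x2 / n
    cramer = phi_squared / min(len(table) - 1, len(table[0]) - 1)
    return x2, phi_squared, cramer


sample = [[20, 10, 5], [10, 15, 10], [5, 10, 20]]
doubled = [[2 * f for f in row] for row in sample]  # same pattern, twice the N

x2_sample, phi2_sample, cramer_sample = association_summary(sample)
x2_doubled, phi2_doubled, cramer_doubled = association_summary(doubled)
print(round(x2_sample, 3), round(x2_doubled, 3))
print(round(cramer_sample, 4), round(cramer_doubled, 4))
```

Doubling every frequency doubles the chi-square but leaves both measures of association unchanged.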



In the case of a table generated from K samples, the proper measure of association is given by the work of Light and Margolin (1971). It is a ratio of the sum of squares between the K groups over the total sum of squares. Their measure, $R^2_{LM}$, is tested for significance by a chi-square statistic calculated as $X^2_{LM} = (N-1)(I-1)R^2_{LM}$, which is tested at (K-1)(I-1) degrees of freedom. Light and Margolin have shown that their statistic tends to be larger, and therefore more powerful, than the ordinary chi-square in the analysis of a K-group design.
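A sketch of the computation, assuming the sums of squares are defined as in the usual statement of Light and Margolin's analysis of variance for categorical data: total SS = N/2 minus the sum of squared category totals over 2N, and within SS = N/2 minus one half of the sum, over groups, of the squared cell frequencies divided by the group total. These definitions, the function name, and the tables are our assumptions and are not given in the text above.

```python
def light_margolin(table):
    """Return (R^2, test statistic, degrees of freedom).

    table[i][k] is the frequency of response category i (rows) in group k (columns).
    """
    n_categories = len(table)
    category_totals = [sum(row) for row in table]
    group_columns = list(zip(*table))
    n = sum(category_totals)

    total_ss = n / 2 - sum(t * t for t in category_totals) / (2 * n)
    within_ss = n / 2 - sum(
        sum(f * f for f in col) / sum(col) for col in group_columns
    ) / 2
    between_ss = total_ss - within_ss

    r_squared = between_ss / total_ss
    statistic = (n - 1) * (n_categories - 1) * r_squared
    df = (n_categories - 1) * (len(group_columns) - 1)
    return r_squared, statistic, df


no_association = light_margolin([[10, 20], [30, 60]])
perfect_association = light_margolin([[50, 0], [0, 50]])
print(no_association, perfect_association)
```

Groups with identical response distributions give a measure of zero, and groups that occupy entirely different categories give a measure of one.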

When the frequencies of the K groups are cross-classified by a dependent variable which is ordered, a more appropriate measure of association has recently been proposed. As noted earlier, this model is analyzed by a non-parametric one-way ANOVA. Carr, Marascuilo, and Serlin (Note 1) have proposed a measure which is the ratio of the calculated test statistic to the maximum the statistic can reach. Their measure ranges from zero to unity and it is interpreted just as eta squared is in the parametric ANOVA.

If both variables are ordered, one is presented with a variety of choices beginning with the standard product-moment correlation coefficient. The use of this method is discussed by Kendall and Stuart (1963) and basically involves the assignment of a set of scores to each category. These pre-assigned scores may be just the natural numbers, 1, 2, 3, ..., normal scores, or a normalized score using relative frequencies of the margins as cutting points for assigning values from the normal distribution. The chief disadvantage of this method centers around the fact that the scores are assigned arbitrarily and the measure calculated will vary with the scoring system chosen.

The most appealing measure in this case may well be Kendall's measure of disarray, tau (Kendall, 1970). Its use in ordered contingency tables is illustrated by Kendall in his third chapter. Because those data in the same row or column of a table are considered as neither concordant nor discordant in relation to each other, but as tied, tau corrected for ties, $\tau_c$, must be used. A competitor to tau has been proposed by Goodman and Kruskal in the first of their three extensive papers on measures of association in cross classification (Goodman and Kruskal, 1954; 1959; 1963). The measure, $\gamma$, is the same as Kendall's tau in the numerator. The denominator is the same except in that it excludes tied values. This means that in all cases |tau| ≤ |gamma|. The use of tau is recommended because the inclusion of the tied data is a more conservative method and tau approaches the normal distribution faster than Spearman's rank order correlation (Kendall, 1970).
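The relation between the two measures can be checked with a short sketch. The tie-corrected tau is computed here in the form usually called tau-b, which we assume corresponds to the corrected coefficient described above; the function name and tables are hypothetical.

```python
import math


def tau_b_and_gamma(table):
    """Kendall's tie-corrected tau (tau-b) and Goodman-Kruskal gamma
    for a table whose rows and columns are both ordered."""
    n_rows, n_cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):
                for m in range(n_cols):
                    if m > j:
                        concordant += table[i][j] * table[k][m]
                    elif m < j:
                        discordant += table[i][j] * table[k][m]

    n = sum(map(sum, table))
    total_pairs = n * (n - 1) / 2
    tied_on_rows = sum(t * (t - 1) / 2 for t in map(sum, table))
    tied_on_cols = sum(t * (t - 1) / 2 for t in map(sum, zip(*table)))

    s = concordant - discordant
    tau_b = s / math.sqrt(
        (total_pairs - tied_on_rows) * (total_pairs - tied_on_cols)
    )
    gamma = s / (concordant + discordant)  # tied pairs are excluded
    return tau_b, gamma


tau_3x3, gamma_3x3 = tau_b_and_gamma([[20, 10, 5], [10, 15, 10], [5, 10, 20]])
tau_2x2, gamma_2x2 = tau_b_and_gamma([[12, 5], [6, 13]])
print(round(tau_3x3, 4), round(gamma_3x3, 4))
print(round(tau_2x2, 4), round(gamma_2x2, 4))
```

Because gamma drops the tied pairs from its denominator, its absolute value is never smaller than that of tau. For the 2 x 2 table the tau value coincides with the phi coefficient.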

In the case of a 2 x 2 table, the well known measure of association based on chi-square is phi and is calculated as

$$\hat{\phi} = \sqrt{\frac{X^2}{N}}.$$

If Kendall's tau is calculated for the same table, then it will be seen that phi = tau. An alternative to the use of phi is to employ the odds ratio.

For a 2 x 2 table the categories defining the table may be labeled as $A$, $\bar{A}$, $B$, and $\bar{B}$. The odds of observing $B$, given the presence of $A$, can be expressed as

$$\frac{P(B \mid A)}{P(\bar{B} \mid A)}.$$

Alternately, the odds of observing $B$, given the absence of $A$, are

$$\frac{P(B \mid \bar{A})}{P(\bar{B} \mid \bar{A})}.$$

A simple measure of association, apparently first proposed by Cornfield (1951), is the ratio of these two odds. In the sample the measure is calculated as

$$\hat{o} = \frac{X_{11}X_{22}}{X_{12}X_{21}}$$

with a standard error estimated as

$$SE_{\hat{o}} = \hat{o}\,\sqrt{\frac{1}{X_{11}} + \frac{1}{X_{12}} + \frac{1}{X_{21}} + \frac{1}{X_{22}}}.$$

A useful discussion of this measure, including additional references, may be found in Fleiss (1973). The choice between the two coefficients, the odds ratio and phi, for the 2 x 2 table is not clear cut and the reader is referred to Fleiss for further discussion.
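A sketch of the sample odds ratio and its standard error, assuming rows index the presence or absence of A, columns index B and not-B, and the large-sample standard error takes the usual form of the odds ratio times the square root of the sum of the reciprocal cell frequencies. The function name and counts are hypothetical.

```python
import math


def odds_ratio_with_se(x11, x12, x21, x22):
    """Sample odds ratio and its approximate standard error.

    x11: A and B        x12: A and not-B
    x21: not-A and B    x22: not-A and not-B
    All four cells are assumed to be non-zero.
    """
    odds_given_a = x11 / x12
    odds_given_not_a = x21 / x22
    odds_ratio = odds_given_a / odds_given_not_a
    se = odds_ratio * math.sqrt(1 / x11 + 1 / x12 + 1 / x21 + 1 / x22)
    return odds_ratio, se


odds_ratio, se = odds_ratio_with_se(12, 5, 6, 13)
print(round(odds_ratio, 3), round(se, 3))
```

An odds ratio of one indicates no association. Unlike phi, the value is unchanged when the frequencies in a row or column are multiplied by a constant.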



Summary

At 80 years of age, Karl Pearson's chi-square statistic remains one of the most useful, versatile, and popular measures for data analysis. Lewis and Burke are two among many authors who have considered its properties and applications and this paper has, hopefully, served as a general review of that literature. In closing, it is interesting to note a couple of points regarding both the misuse and use of chi-square.

In spite of the age of the Lewis and Burke article it is unfortunate to discover that many of the errors outlined in their work can be found in research today. Perhaps because the measure is so well known, and so easily used, it is also easily misused. Care must be taken to ensure that when one selects a method to analyze a set of data, one employs the method(s) used correctly. This applies not only to Pearson's chi-square, but also to every method used for inferential purposes.

As a final point, it is important to remember that, as noted earlier, several aspects of the chi-square procedures are still subject to debate, such as the minimum expected frequencies allowable and the best way to partition a contingency table. Very few things in life are written in granite and the "right" way to analyze a given set of data is not one of those things. The wise researcher will keep track of the relevant literature, seek advice from colleagues, and will forsake the automatic and mechanical application of statistical methods.
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